Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi Zhuang Autonomous Region

To control the HIV/AIDS epidemics in Guangxi Zhuang Autonomous Region in China, Guangxi government launched the 5-year Guangxi AIDS Conquering Project (GACP, Phase I: 2010-2014, Phase II: 2015-2020). In the project, three measures are implemented, such as great improvements of the coverage of HIV/AIDS education, promotion of HIV voluntary counseling and testing, and enhancement of antiretroviral treatment. In this paper, we explore the effects of the three measures of GACP by construction of a Susceptible-Infected-Diagnosed-Treated population compartments model and via evaluation of the basic reproduction number derived from the model. A computational framework is developed for estimating the model parameters based on the HIV surveillance data, with application of the Markov-Chain Monte-Carlo method and Nonlinear Least Squares method. By estimating the new infections and evaluating the basic reproduction number, we find that the implementation of the three measures of GACP has a significant effect on controlling the rise of HIV/AIDS cases and the epidemic trend. Compared with HIV voluntary counseling and testing, strengthening HIV/AIDS education and expanding the coverage of antiretroviral treatment show a greater impact on HIV/AIDS epidemic control, which provides a reference project for other provinces with a similar epidemic situation in Guangxi Zhuang Autonomous Region. At the same time, our research fills the current research gap for the evaluation of large-scale AIDS prevention and control projects in developing areas.


Introduction
Acquired immune deficiency syndrome (AIDS) is a disease caused by human immunodeficiency virus (HIV) infection [1][2][3]. It is a world-widely spreading infectious disease. The and screening strategies is the best cost effective strategy. Mukandavire et al. [16] demonstrate the use of sex-structured HIV/AIDS compartment models in assessing the effectiveness of condom use as a preventative strategy in a heterosexually active population. Nyabadza et al. [14] propose and analyze a HIV/AIDS model incorporating public-health information compaigns. The results demonstrate that an increase in the rate of dissemination of effective public-health information campaigns results in a decrease in the prevalence of the disease. The results of those studies are based mainly on the stability analysis of the dynamical models, which lacks the support of surveillance data under these implementations.
In this paper, based on the surveillance data of HIV/AIDS case reports, testing and treatment data from Guangxi Zhuang Autonomous Region, we explore the effects of the three measures of GACP project by construction and fitting of a Susceptible-Infected-Diagnosed-Treated compartment model. The three measures include prevention (improvements of the coverage of HIV/AIDS education and reinforcement of behavior intervention), HIV testing (promotion of HIV voluntary counseling and testing, and expansion of HIV testing in medical institutions), and enhancement of ART. In the model, the populations are grouped into the susceptible population, the HIV-infected population (HIV-positive), the diagnosed HIV-positive population, the population under antiretroviral treatment and the population who drop out of the treatment. To effectively simulate the spreading and controlling dynamics of the HIV, we set some parameters in the model to be time dependent, such as the prevention rate, the diagnosis rate and the ART enrolment rate. Based on the HIV surveillance data and the compartment model, we establish a mathematical computational framework for estimating the parameters of the model, evaluation of the basic reproduction number and assessment of the HIV/AIDS epidemic in Guangxi Zhuang Autonomous Region. The Markov-Chain Monte-Carlo method and Nonlinear Least Squares method are applied for parameter estimation in the computational framework. Based on the computational framework, the effects of the three measures of GACP project are evaluated.
Although there are many studies that apply mathematical models to investigate the prevention and control of the disease, there is still not much research on the mathematical model of HIV/AIDS transmission dynamics based on the real-world surveillance data. In this paper, the data-driven mathematical model is well applied to predict the AIDS epidemic situation in Guangxi Zhuang Autonomous Region, and moreover, the implementation effect of the prevention (publicity and intervention), testing and ART measures are well evaluated simultaneously based on the surveillance data and the model. The results of the study could provide reference for developing future provincial HIV/AIDS prevention and control strategies for Guangxi Zhuang Autonomous Region and other provinces with similar epidemic situations.
A list of abbreviations and terms used in the paper is given in Table 1.

Data
The data used in this paper comes from the National HIV/AIDS Comprehensive Information System of China's disease prevention and control information system and the data are not publicly available. The population data comes from the published data of Guangxi Statistical Bureau [24]. The data including HIV/AIDS case report, treatment and intervention data, are collected by local health institutions, including county and township CDC, hospitals and clinics, and reported to Guangxi CDC and China CDC through prefectures and municipal health institutions. Those data included in this study are summary statistics, which do not contain any identifiers that could link the data to individual subjects in the local HIV/AIDS surveillance systems. The mathematical modeling framework is approved by the Guangxi Institutional Review Board (GXIRB2015-0008).

The mathematical model
We develop a compartmental model for the HIV transmission dynamics with differential equations. We group the total population into five compartments:  Table 2.  We denote the number of susceptible people at time t by S(t), the number of HIV-positive people who are not diagnosed at time t by I(t), the number of diagnosed HIV-positive people who are not under ART at time t by D(t), the number of diagnosed HIV-positive people that are under ART at time t by T(t), and the number of people who drop out of ART at time t by G(t). The time unit used in the model is per year to align with the available data.
Susceptible individuals (S) are assumed to be recruited at rate Λ and undergo death at rate d S . Susceptible individuals become infected through contacts with HIV-positive individual (I) in an incidence λ(t)S(t), where N(t) indicates the total population size N(t) = S(t) + I(t) + D(t) + T(t) + G(t). The dynamic of the susceptible individuals (S) is given by where the infection incidence λ(t) is related to the transmission rate of HIV and the contact rates with infectious compartments, including the infected (I), diagnosed (D) and treated (T) and those drop out of the treatment (G). Here we take where c denotes the contact rate, and β I , β D , β T and β G are transmission coefficients of compartments I, D, T and G, respectively. The improvement of the coverage of HIV/AIDS education and the prevention measures could reduce the contact rate, which is indicated by the term 1 − η(t).
The infected individuals (I) either undergo death at rate d I , or are diagnosed through HIV testing at rate α(t). The dynamics of I(t) reads dI dt ¼ lðtÞSðtÞ À d I IðtÞ À aðtÞIðtÞ: The promotion of HIV voluntary counseling and testing measures in the GACP project has effect on the HIV testing rates α(t), so we assume it to be time-dependent.
The diagnosed individuals (D) either undergo death at rate d D or are enrolled into the ART program at rate γ(t). The dynamics of D(t) follows where α(t)I(t) is the annual number of new reports. The enhancement of the antiretroviral treatment would have influence on the treatment enrolment rate γ(t), which is assumed to be time dependent.
The individuals under antiretroviral treatment (T) either undergo death at rate d T or drop out of treatment at rate δ, thus where γ(t)D(t) is the number of new treatment enrollments. The drop-outs include those due to treatment failure or loss of follow-up. The drop-out individuals (G) undergo death at rate d G , so that The meanings and values of the parameters are illustrated in Table 3.
With the implementation of the Four Frees and One Care program in 2003, the Chinese government had rapidly scaled up HIV testing and ART, which was reflected by an increase in the number of HIV tested people and in the number of treatment centers. The surveillance data for Guangxi Zhuang Autonomous Region is available from the year 2005. The 5-year Guangxi AIDS Conquering Project (GACP) was launched from 2010. To correctly adjust for the increase in new HIV testing and ART, we use a time-dependent prevention rate η(t), diagnosis rate α(t), ART enrolment rate γ(t) and model them by continuous piecewise linear functions. We assume that ( where γ(t) will be fitted with ART enrollment surveillance data. Estimated

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi

The basic reproduction number
Our model was used to calculate the reproduction number R C , which is the actual average number of secondary cases per primary case caused observed in a population for an infectious disease in the presence of control measures. The reproduction number includes the effect of intervention measures, which varies as the epidemic progresses with time. Since our model parameters are time dependent, the reproduction number is suitable indicator of the strength of the transmission dynamics. In our model, effects of the national programs (Four Frees and One Care) after 2003 and GACP in Guangxi Zhuang Autonomous Region after 2010 are incorporated into the time dependent prevention rate η(t), diagnosis rate α(t) and ART enrolment rate γ(t). The impact of the programs during the time period 2005-2020 is measured by the time-varying reproduction number R C where β = cβ I , q 1 = cβ D /β I , q 2 = cβ T /β I and q 3 = cβ G /β I . The mathematical derivation of the basic reproduction number and the interpretation are given in Appendix B.

Framework
In this paper, we construct a computational framework [26] (in Fig 2) to analyze the surveillance data, calibrate the mathematical model, and then evaluate the effect of GACP and put

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi forward suggestions for the prevention and control of HIV/AIDS epidemic in Guangxi Zhuang Autonomous Region. Firstly, we use the surveillance data to estimate the death rate of the D, T and G population, the dropout rate δ and the ART enrolment rate γ(t) directly. Secondly, we establish differential equations model to describe the dynamic behavior of different groups. Then we calibrate the model parameters by MCMC and nonlinear least square methods. Finally, we try to assess the HIV/AIDS epidemic in Guangxi Zhuang Autonomous Region.
In addition, the reproduction number is obtained by the mathematical model. By analyzing the influence of different prevention and control measures on the reproduction number, we obtain valuable suggestions for the prevention and control of HIV/AIDS epidemic in Guangxi Zhuang Autonomous Region.

Statistical analysis
Parameter estimation using surveillance data. Demographic parameters Λ and d N are estimated by fitting the equation of the total population to the population data of Guangxi Zhuang Autonomous Region from 2005 to 2020. We obtain that Λ = 720000 and d S = 0.0054. The result is shown in Fig 4A. Values of the parameters d D , d T , d G , δ, and γ(t) are estimated directly from the surveillance data. From the data of total survivals of HIV-positive patients and the deaths, and the total treated and death, we estimate the fatalities of d D and d T to be d D = 0.1134, and d T = 0.0308. Based on the data of new treatments and drop-outs per year, we estimate the dropout rate to be δ = 0.0533, and the ART enrolment rate γ(t) to be where β = cβ I , q 1 = cβ D /β I , and q 2 = cβ T /β I . The nonlinear least squares method is applied to find the point estimates for mathematical model parameters and the initial value for compartment I at the end of 2005, which minimize the summation of squared error between model output and the available surveillance data.
The estimated values are shown in Table 3.
Before using the nonlinear least squares method, we use the MCMC method [22] to obtain the 95% confidence interval of some parameters. The MCMC method provides the iterative starting point and 95% confidence interval basis for the nonlinear least square method, so as to prevent the nonlinear least square method from falling into the local optimal solution in a large interval and unable to find the optimal solution in a small interval, which can effectively fill the gap between statistical data and the nonlinear least square method. The details of the MCMC method are provided in the Appendix A.
Fitting effect. The goodness of the fitting is accounted by coefficient of determination [27] where SS res represents the error between the statistical data y and the best fit dataŷ, which is the sum of squares of residuals, also called the residual sum of squares, that is

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi and SS tot indicates the error between the statistical data and the average of the statistical data � y, which is proportional to the variance of the data, that is The models are well fitted between the simulated data and the surveillance data (R 2 = 0.90), and the fitting effects are shown in Fig 4. Here, the simulation data of individuals in the diagnosed compartment is not well fitted with the surveillance data after the implementation of the GACP (see Fig 4B). This may partly due to the data collection and calculation of individuals in the diagnosed compartment, while the annual diagnosed data and annual treatment enrolment data are more reliable which are well fitted as shown in Fig 4E and 4F, respectively.

Evaluation of the effects of GACP on the epidemic in Guangxi Zhuang Autonomous Region
Under the implementation of GACP and the National Program (Four Free and One Care), the Guangxi government has strengthened HIV testing, prevention and treatment; the numbers of

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi HIV testing and treatment have greatly increased in Guangxi Zhuang Autonomous Region, and furthermore, the publicity and education were strengthened, as well as the promotion of condoms.
Preventive measures include warning publicity and education, actively promoting various publicity activities, carrying out many publicity lectures, and strengthening the publicity of floating population, in addition, interventions including needle-syringe exchange, Methadone maintenance treatment, peer education, condom promotion are provided to injecting drug users (IDUs); condom promotion and peer education are provided to sex workers and men who have sex with men (MSM). HIV testing measures include voluntary counseling and testing towards high risk groups like IDUs, sex workers and their clients, MSM and spouses of HIV cases, and moreover, the range of HIV testing in medical institutions is expended, such as the addition of pre-operation testing, pre-marital testing, pregnancy and childbirth testing and blood donation testing. Antiviral therapy measures include improving the quality of medical services, free ART, early initiation of ART regardless of their CD4+ counts level.
We utilize the calibrated model to assess the implementation effect of GACP; the results are shown in Fig 5 and  At the initial stage of GACP implementation, the number of people initiated ART (T) is briefly higher than that without GACP (from 2010 to 2022). This is due to the implementation of GACP, that is, the potential HIV/AIDS patients are treated; with the advancement of time, the number of HIV/AIDS patients receiving treatment effectively increased. Similarly, in the process of implementing GACP, the number of people who drop out of ART (G) is temporarily higher than the number of G without GACP (from 2010 to 2022). This maybe due to the increase in the number of AIDS patients receiving treatment, and more AIDS people giving up treatment for various reasons, such as economic, suicide [28]. With the continuous implementation of GACP, the number of patients giving up ART has been effectively controlled. According to our model estimates, through the implementation of GACP, Guangxi Zhuang Autonomous Region had reduced 52% newly reported HIV infections, 17.2% HIV/AIDS survivors and 42.2% deaths in 2020, as shown in Table 4. At the same time, we find that from 2010 to 2020, the amount of the above reduced percentage are increasing every year.  Table 5.

Sensitive analysis for R C and alternative intervention scenarios
In 2005, the most sensitive parameters for R C are the transmission coefficient β D and death rate d D of the diagnosed population (compartment D). As HIV testing and ART coverage increases, more HIV-positive people are diagnosed and treated, in the case of limited medical resources, which also means that more patients drop out treatment (for economic reasons, suicide reasons [28] or other reasons). Therefore, in 2010, the sensitivity of transmission coefficient β D and death rate d D to R C gradually decreased in the diagnosed population (compartment D), while the sensitivity of transmission coefficient β G and death rate d G gradually increased in the population dropped out of ART (compartment G). In 2015, with the increase in the number of drop-outs, the transmission coefficient β G and death rate d G of the population dropped out of ART (compartment G) became highly sensitive to R C .
In order to further investigate the impact of prevention, diagnosis and treatment measures on HIV/AIDS epidemic, we compared the effects of the three different measures on the reduction of the basic reproduction number R C (−). In contrast to the improvement of diagnosis rate α(+ %) and ART enrolment rate γ(+ %), the improvement of the protection rate (education

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi rate) η(t) has a great influence on the reduction of the reproduction number R C , as shown in Fig 7. We further take the non-implementation of GACP as the control sample, to compare different implementation scenarios applying the calibrated model. Compared with the control   Table 5. https://doi.org/10.1371/journal.pone.0270525.g006

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi sample, we construct seven hypothetical scenarios, as shown in Table 6. Scenario 7, for example, includes antiretroviral treatment under the national program, prevention and HIV testing under the GACP program, and so on in other scenarios. The simulation results are shown in Fig 8 and Table 6. The maximum reduction (to 1.3527) and the minimum reduction (to 1.6413) of R C are achieved by Scenario 8 and Scenario 3,

PLOS ONE
respectively. In addition, we find that that there is little difference between Scenario 1 (to 1.6510) and Scenario 3 (to 1.6413) for reduction of the reproduction number. Similarly, Scenario 2 (to 1.4872) and Scenario 5 (to 1.4786), Scenario 4 (to 1.5117) and Scenario 7 (to 1.5016), Scenario 6 (to 1.3618) and Scenario 8 (to 1.3527) have similar effects. That is to say, diagnosis α has some effect on the reproduction number R C in the former period, but later on has little influence on the reproduction number R C compared with prevention η and treatment γ.
In addition, we simulate the number of new HIV infections (in 2020) with different scenarios. We observe that Scenario 4 (37.2%) has a better effect on reducing the number of new HIV infections than Scenario 2 (21.6%) and Scenario 3 (2.3%). It is worth mentioning that although ART could effectively reduce the number of new infections, it is on the premise of large-scale HIV testing. Without large-scale HIV testing, ART cannot be accurately administered to the target population.

Discussion
The current studies of AIDS epidemics lack research on the evaluation of large-scale AIDS control projects based on surveillance data. Our study is the first to model and evaluate the implementation effect of the GACP project based on surveillance data in Guangxi Zhuang Autonomous Region, China. The GACP project is an effective attempt to prevent and control AIDS by the local government in China. Our research provides abundant surveillance data,  Table 6. https://doi.org/10.1371/journal.pone.0270525.g008

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi prediction data and model support for the AIDS epidemic in Guangxi Zhuang Autonomous Region. Furthermore, it provides an important reference for future AIDS epidemic control and public health policies in developing regions.
In order to estimate the epidemic trend of HIV/AIDS in Guangxi Zhuang Autonomous Region, China, we establish a mathematical model assuming homogeneous population mixing in different cities in this area based on the surveillance data. In particular, in order to access the impact of HIV/AIDS transmission in different populations, the mathematical model is a Susceptible-Infected-Diagnosed-Treated population compartments model. Based on this model, we put forward a mathematical framework for evaluation, including analysis of surveillance data, calibration of mathematical model, calculation of the reproduction number, prediction of the HIV/AIDS epidemic and evaluation of the effectiveness of the GACP program.
We observe that the GACP program effectively reduced the number of HIV infections and the number of HIV related deaths. In contrast, the implementation of GACP has little effect on the susceptible people, which is reasonable, since the implementation of GACP will not affect the natural growth of the population. The basic reproduction number R C is derived from the mathematical model and estimated based on HIV/AIDS surveillance data in Guangxi Zhuang Autonomous Region. The sensitivity analysis (as shown in Fig 6) show that the transmission coefficients β G and death rate of people who drop out of ART (compartment G) have a great influence on R C over time. Further numerical simulations (as shown in Fig 7) show that measure of the HIV-testing as a single-strategy would have a limited impact on the reduction of the basic regeneration number R C , that is, increasing in HIV testing rates α(t) alone while maintaining the prevention rate η(t) and treatment rates γ(t), would have only a slight impact on the number of HIV/AIDS cases in the near future. This result further implies that how to promote HIV/AIDS publicity and education and improve the ART enrollment rate has become an important issue in HIV/AIDS prevention and treatment. Therefore, it is important for the Guangxi government to retain the current free antiretroviral treatment, and try to allocate treatment resources to allow HIV/AIDS patients to receive continuous antiretroviral treatment.
In order to estimate the parameters more accurately, we apply the classical machine learning method, namely the MCMC method [22,29]. Admittedly, our results have some limitations. First of all, the estimation results of this study are based on surveillance data. We ignore the specific transmission mode of HIV in the model. In fact, after 2005, the main mode of HIV transmission in Guangxi Zhuang Autonomous Region has changed greatly. Before 2006, the mode of HIV transmission in Guangxi was mainly caused by injecting drugs, while since 2006, heterosexual sex has become the dominant mode of HIV transmission, followed by drug injection [30]. The estimation results using a simplified model may underestimate the transmission of HIV. Secondly, our model does not consider the heterogeneity of the HIV-positive population. For example, the estimated values of transmission rates β I , β D , β T , β G are the average values; the age structure [31] and sex structure [16] of the population are not considered. For example, older heterosexual adults seem to have a higher risk of HIV infection [30,32]. In view of the significant spatial variability of HIV infection, the reproduction number under heterogeneous mixing may be greater than that under homogeneous mixing. The model needs further development and refinement to take into account the heterogeneity within different groups in order to make a more accurate estimate of the HIV/AIDS epidemic. In addition, as the expansion of antiretroviral treatment is expected to improve the survival rate of HIV/AIDS patients, the death rate of patients receiving antiretroviral treatment is expected to decrease over time (as shown in Fig 3B). However, we adopt a constant death rate in our model, which may lead to a slight overestimation of R C . Finally, it should be noted that our Susceptible-Infected-Diagnosed-Treated population compartments model is idealized and does not consider the spatial factor, that is, the mobility of AIDS patients between provinces; Guangxi Zhuang Autonomous Region is located at the border between the two countries, so our model does not consider the transmission input from other countries (such as the Golden Triangle region). Therefore, the model could be improved to a patch model between different regions.
However, our results strongly suggest that the implementation of GACP will lead to an overall decline in HIV infection. Therefore, enhanced interventions and local governmentbased HIV/AIDS support programs should be widely implemented in areas with severe epidemics to effectively reduce HIV spreading in the Chinese mainland.

Appendix A. Markov Chain Monte Carlo method
Markov chain Monte Carlo method [22,29] was used to obtain 95% confidence intervals for model parameters β I , d I , d D , q 1 , q 2 , I 0 . We take an example of the process of obtaining the 95% confidence interval of parameter β I . It can be obtained for other parameters in the same way.

Sensitive analysis of SSE
Let SSE(β I ) is the sum of squared errors between statistical data y and model outputŷðb I Þ, that is, where N is the number of data points.
We performed sensitivity analysis with respect to SSE for the parameters β I , d I , d D , q 1 , q 2 , I 0 . Following the method of Latin Hypercube Sampling, we generated 100,000 samples to calculate the Pearson correlation coefficient between each parameter combinations and SSE.
The results are shown in The posterior distribution of parameter β I from data y Assume that the errors introduced in statistical data are normally distributed, that is, whereŷðb I Þ is the model output given the parameter β I , � is a random variable with multivariate normal distribution, that is where S = diag(σ 2 , σ 2 , . . ., σ 2 ) and σ 2 is a random variable with inverse Gamma distribution.
Based on the above assumptions, the likelihood function is calculated as follows

PLOS ONE
Predicting the HIV/AIDS epidemic and measuring the effect of AIDS Conquering Project in Guangxi we can obtain that Using Bayesian theorem, we can further obtain the posterior distribution of parameter β I from data y, Obviously, it is difficult to derive p(β I |y) directly, we use Markov-Chain Monte-Carlo method to approximate it.

Metropolis-Hastings algorithm [33]
Suppose p(β I |y) is the target probability distribution. The process of Metropolis-Hastings algorithm is as follows: • If u � A(β I0 , β I1 ), then reject the new state and copy the old state forward β I(t+1) = β I0 • Increment: set t = t + 1.
Similarly, we obtain the probability density function of all parameters in Fig 10. Therefore, we get the 95% confidence interval of the above parameters in Table 7.

Appendix B. The basic reproduction number
Based on the method in [35,36], we calculate the basic reproduction number under GACP.
We suppose that F i is the rate of appearance of new infections in compartment i; R þ i is the rate of transfer of individuals into compartment i; R À i is the rate of transfer of individuals out of compartment i. Then we have  Based on Lemma 1 in [35], the derivatives DF and DR are partitioned as we obtain and so that R C ðtÞ ¼ rðFV À 1 Þ ¼ ð1 À ZðtÞÞ b d I þ aðtÞ 1 þ q 1 aðtÞ d D þ gðtÞ þ q 2 aðtÞgðtÞ ðd D þ gðtÞÞðd T þ dÞ þ q 3 aðtÞgðtÞd ðd D þ gðtÞÞðd T þ dÞd G � � : where β = cβ I , q 1 = cβ D /β I , q 2 = cβ T /β I and q 3 = cβ G /β I .